A PROCESS FOR IDENTIFYING PHARMACEUTICALLY ACTIVE
AGENTS USING AN EPITOPE-TAGGED LIBRARY

FIELD OF THE INVENTION 
The present invention relates to an improved process for identifying 
pharmaceutically active agents.



BACKGROUND OF THE INVENTION

Over the past ten years, there has been a growing demand for the production
and identification of agents that have pharmacological activity as, for example,
agonists or antagonists of various cellular acceptor molecules, such as cell-surface
receptors, enzymes or antibodies. In the continuing search for new chemical
moieties that can effectively modulate a variety of biological processes, the standard
method for conducting a search is to screen a variety of pre-existing chemical
moieties, for example, naturally occurring compounds or compounds which exist in
synthetic libraries or databanks. The biological activity of the pre-existing chemical
moieties is determined by applying the moieties to an assay which has been
designed to test a particular property of the chemical moiety being screened, for
example, a receptor binding assay which tests the ability of the moiety to bind to a
particular receptor site.

In an effort to reduce the time and expense involved in screening a large
number of randomly chosen compounds for biological activity, several
developments have been made to provide libraries of compounds for the discovery
of lead compounds. The chemical generation of molecular diversity has become a
major tool in the search for novel lead structures. Currently, the known methods for
chemically generating large numbers of molecularly diverse compounds generally
involve the use of solid phase synthesis, in particular to synthesize and identify
peptides and peptide libraries. See, for example, Lebl et al., Int. J. Pept. Prot. Res.,
41, p. 201 (1993) which discloses methodologies providing selectively cleavable
linkers between peptide and resin such that a certain amount of peptide can be



WO 96/24847 



PCT/US96/02490 



liberated from the resin and assayed in soluble form while some of the peptide still
remains attached to the resin, where it can be sequenced; Lam et al., Nature, 354, p.
82 (1991) and WO 92/00091 which disclose a method of synthesis of linear
peptides on a solid support such as polystyrene or polyacrylamide resin; Geysen et
al., J. Immunol. Meth., 102, p. 259 (1987) which discloses the synthesis of peptides
on derivatized polystyrene pins which are arranged on a block in such a way that
they correspond to the arrangement of wells in a 96-well microtiter plate; and
Houghten et al., Nature, 354, p. 84 (1991) and WO 92/09300 which disclose an
approach to de novo determination of antibody or receptor binding sequences
involving soluble peptide pools.

Nonpeptidic organic compounds, such as peptide mimetics, can often surpass
peptide ligands in affinity for a certain receptor or enzyme. An effective strategy
for rapidly identifying high affinity biological ligands, and ultimately new and
important drugs, requires rapid construction and screening of diverse libraries of
non-peptidic structures containing a variety of structural units capable of
establishing one or more types of interactions with a biological acceptor (e.g., a
receptor or enzyme), such as hydrogen bonds, salt bridges, pi-complexation,
hydrophobic effects, etc. However, work on the generation and screening of
synthetic test compound libraries containing nonpeptidic molecules is now in its
infancy. One example from this area is the work of Ellman and Bunin on a
combinatorial synthesis of benzodiazepines on a solid support (J. Am. Chem. Soc.
114, 10997 (1992); see Chemical and Engineering News, January 18, 1993, page
33).

Historically, the process to identify pharmacologically active agents has been
characterized as time-consuming, labor-intensive and inefficient, and usually
involves a single biological target per screening effort. Further, efforts to improve
the process have focused entirely on increasing the number and quality of agents per
screening effort. Until now there has been little if any emphasis on improving the
efficiency with which biological targets are utilized in screening assays.
Additionally, efforts are underway to fully characterize the human genome.
This initiative has already produced thousands of novel gene sequences, many of
which are of unknown function. To identify which of these genes present
opportunities for pharmaceutical intervention would take years of diligent effort using
current methodology. Presently there is a need for improved methods for efficiently
identifying pharmaceutically active agents and for better utilization of information
resulting from the human genome initiative.

Many of the disadvantages of the known methods as well as many of the 
needs not met by them are addressed by the present invention which, as described 
more fully hereinafter, provides numerous advantages over the known methods. 

SUMMARY OF THE INVENTION

This invention relates to a process for identifying pharmaceutically active
agents which comprises simultaneously expressing a plurality of Uniquely Tagged
Target Agents to form a Target Library; preparing an Agent Candidate Pool; and
testing the Target Library and the Agent Candidate Pool in an assay which identifies
agents of the candidate pool having desired characteristics. This invention also
relates to pharmaceutically active agents identified by such process.

This invention also relates to a process for simultaneously expressing a 
plurality of uniquely tagged genes to form a Target Library. 

This invention also relates to a process for simultaneously expressing a
plurality of uniquely tagged gene products to form a Target Library.

This invention also relates to a process for simultaneously expressing a 
plurality of uniquely tagged membrane-bound receptors to form a Target Library. 

This invention also relates to a Target Library comprising a plurality of
expressed uniquely tagged genes.

This invention also relates to a Target Library comprising a plurality of
expressed uniquely tagged gene products.

This invention also relates to a Target Library comprising a plurality of
expressed uniquely tagged membrane-bound receptors.

This invention also relates to Spatially Encoded Target Libraries.

This invention also relates to methods of screening for pharmaceutical activity
utilizing Spatially Encoded Target Libraries.

DETAILED DESCRIPTION OF THE INVENTION
As used herein, the term "Agent Candidate Pool" means a source of one or
more entities selected from the group consisting of: chemical compounds, such as
mixtures of individual compounds, peptides, natural products and monoclonal
antibodies, which are to be tested for pharmaceutical activity. These entities are
optionally tested as combinatorial compound libraries, combinatorial peptide libraries
or variomers.

As used herein, the term "Target Library" means a plurality of expressed
Uniquely Tagged Target Agents.

As used herein, the term "Uniquely Tagged Target Agent" means uniquely 
tagged genes, uniquely tagged gene products or uniquely tagged membrane-bound 
receptors. 

As used herein, the term "gene products" means proteins produced by the
expression of a gene, for example an enzyme, a receptor or a docking protein such
as an SH2 domain.

As used herein, the term "uniquely tagged gene(s)" means that, prior to
expression, each gene that will make up the Target Library is engineered with a
distinct marker such as a short peptide sequence (hereinafter epitope tag) which is
capable of recognition by specified antibodies. Said epitope tags can be
manufactured in large number. Additionally, several genes can be grouped with the
same tag or subgrouped with different tag combinations, or tandem tags can be
assembled to allow sequential selection.
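
The grouping and tandem-tag options can be pictured as a small lookup problem. The following is a minimal sketch only: the tag labels (`GRP1`, `T1`, and so on), the gene names and the helper `select` are hypothetical and are not taken from this specification. It illustrates sequential selection, first on a tag shared by a group and then on a tag unique to one member.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaggedGene:
    name: str    # hypothetical gene label
    tags: tuple  # epitope tags fused in tandem, in order


# Hypothetical library: each member carries a shared group tag and a unique tag.
library = [
    TaggedGene("kinase_A", ("GRP1", "T1")),
    TaggedGene("kinase_B", ("GRP1", "T2")),
    TaggedGene("sh2_C", ("GRP2", "T3")),
]


def select(members, tag):
    """Keep members carrying `tag`, mimicking capture by an anti-tag antibody."""
    return [m for m in members if tag in m.tags]


# Sequential selection: first the group, then a single member within it.
group = select(library, "GRP1")
single = select(group, "T2")
print([m.name for m in group])
print([m.name for m in single])
```

In this sketch the order of the two selections does not matter for the final result, but selecting on the shared tag first mirrors the use of a common tag to handle a whole group before individual members are resolved.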

As used herein, the term "uniquely tagged gene product(s)" means that, prior
to expression, each gene product that will make up the Target Library is engineered
with a distinct marker such as a short peptide sequence (hereinafter epitope tag)
which is capable of recognition by specified antibodies. Said epitope tags can be
manufactured in large number. Additionally, several gene products can be grouped
with the same tag or subgrouped with different tag combinations, or tandem tags can
be assembled to allow sequential selection.

As used herein, the term "uniquely tagged membrane-bound receptor(s)"
means that, prior to expression, each receptor that will make up the Target Library is
engineered with a distinct marker such as a short peptide sequence (hereinafter
epitope tag), which is capable of recognition by specified antibodies, and then
incorporated into suitable recipient cells. Said epitope tags can be manufactured in
large number. Additionally, several receptors can be grouped with the same tag or
subgrouped with different tag combinations, or tandem tags can be assembled to
allow sequential selection. Incorporation of one target receptor type per cell is
preferred, as this will enable adjustment of the relative composition of the Target
Library by mixing together different numbers of cells. This cell mixture will
constitute the Target Library in this example.

Many receptors require a cell membrane for proper binding and functional
responses to ligands, for example 7 transmembrane G-protein coupled receptors. As
used herein, receptors which require a cell membrane for proper binding and
functional response and which are incorporated into a cell are referred to as
"membrane-bound receptors". Generally, Target Library screening
approaches can be applied to membrane-bound types of receptors as indicated
herein. A preferred method of screening membrane-bound receptors is to use one or
more chemical compounds or a combinatorial chemical library as the Agent
Candidate Pool and provide a photoactivatable group in the template during the
synthesis of the compounds or combinatorial library. Examples of such groups are
aryl azides or benzophenones. Additionally, a readily detectable marker such as
biotin is preferably incorporated into the template. Thus, each compound will
preferably carry both the photoactivatable and detection groups. Alternatively, if
the ligands for the targets are known, the photoactivatable and detection tags need
only be incorporated into the ligands, and unlabelled compounds can be screened by
competition. In this preferred example of the invention the compound or
combinatorial library is mixed with the Target Library and irradiated to activate the
photoactivatable group. The cells of the Target Library are lysed and the mixture is
added to an array of spatially encoded antibodies raised against the epitope tags of
the Target Library. The matrix of antibodies is probed with reagents to detect the
marker (for example, streptavidin-peroxidase to detect biotin). The spectrum of
targets that interact with a given compound or group of compounds can be identified
by association of the detection marker with a particular subset of epitope tags. If
necessary, the Target Library can be pre-screened for incorporation of the detection
marker prior to deconvolution on the antibody matrix.

As used herein, the term "simultaneously expressed" means that numerous
Uniquely Tagged Target Agents are expressed and purified together without regard
to maintaining the identity of the subject gene, gene product or membrane-bound
receptor.

As used herein, the term "assay", "screen" or "testing for biological activity"
includes any form of testing for pharmaceutically relevant activity.

In preparing the Target Library, one may initially begin with any manageable
number of genes, preferably fewer than one hundred. Preferably these genes are
grouped, for example, by expression in tissues of interest or by membership in a
common family, such as kinases, based on sequence homology. Each gene is
uniquely tagged, preferably with an epitope tag, and simultaneously expressed.

In preparing the Target Library, one may initially begin with any manageable
number of gene products, preferably fewer than one hundred. Preferably these gene
products are grouped, for example, by expression in tissues of interest or by
membership in a common family, such as SH2 domains, based on sequence homology.
Each gene product is uniquely tagged, preferably with an epitope tag, and
simultaneously expressed.

In preparing the Target Library, one may initially begin with any manageable
number of receptors, preferably fewer than one hundred. Preferably these receptors are
grouped, for example, by expression in tissues of interest or by membership in a
common family, such as 7 transmembrane G-protein coupled receptors, based on
sequence homology. Each receptor is uniquely tagged, preferably with an epitope tag,
incorporated into a cell and simultaneously expressed.

To identify compounds that perturb the function of the targets in the Target
Library, an Agent Candidate Pool, preferably one or more chemical compounds or a
combinatorial compound library, is prepared by standard methods. Active agents are
identified by known assay techniques or by engineering a selection step based on the
anticipated activities contained in the Target Library (as described in the Examples
below). Deconvolution of the Target Library can be accomplished by means of the
epitope tag, while analytical methods, such as mass spectral techniques, can be used
for Agent Candidate Pool analysis. Thus the identification of active agents and the
function of the Uniquely Tagged Target Agent are analyzed in parallel.

Pharmaceutically active agents identified by the above process are optionally
screened in cell based assays for a function of interest, followed by an optional
functional screen or animal model. Also included within the scope of the present
invention are pharmaceutically active agents identified by the processes disclosed
herein. While it is not necessary to identify the specific target (gene, gene product or
receptor) affected, if toxicity problems arise this information could be
useful and could readily be obtained retrospectively. Further, parallelism is
maintained in the secondary screens, with multiple compounds being simultaneously
evaluated in a battery of cell-based assays.

A particularly advantageous aspect of this invention is the ability to maintain
parallelism in all phases of the discovery process, i.e. target selection, expression and
purification, compound synthesis, and primary and secondary screening. Additional
advantages of this process include: the ability for multiple targets to be
simultaneously evaluated without prior knowledge of function or disease association,
built-in selectivity profiling, a greater chance of uncovering totally unexpected
activities and disease indications, and increased efficiency by utilizing more drug
candidates per unit time.

Within the general framework of the parallel approach described above,
numerous methodologies for screening Target Libraries are possible. One such
methodology is described in Example 1. All such methodologies are within the
scope of the invention as claimed herein.

In a further aspect of the invention there is provided a preferred method for
screening Target Libraries which utilizes expressed Uniquely Tagged Target
Agents, wherein the tagging marker is an epitope tag, spatially encoded in discrete
areas of a substrate (referred to herein as Spatially Encoded Target Library or
SETL). Spatially Encoded Target Libraries for use herein are prepared by:

i) depositing groupings of antibodies, each grouping being directed against a
different epitope tag of the Target Library and each epitope tag of the Target
Library corresponding to at least one antibody grouping, in an array (hereinafter
antibody region), preferably in a mosaic pattern, on a substrate, preferably a
polyvinylidene difluoride (PVDF) membrane, a nitrocellulose membrane, a
polystyrene layer or a silicon wafer,

ii) blocking the antibody region of the membrane with an inert protein and

iii) exposing the Target Library to the antibody region.

Antibody groupings will selectively bind to their corresponding epitope tags,
thereby localizing like groups of Uniquely Tagged Target Agents to a specific
area of the membrane. The above process is repeated, as desired, in order to
produce a plurality of functionally similar SETLs.
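
Because each antibody region sits at a known position, reading a SETL reduces to mapping positions back to tags and tags back to targets. The sketch below assumes a hypothetical 2 x 2 mosaic, made-up tag and target names, normalized signal values and an arbitrary threshold; none of these values comes from the specification.

```python
# Hypothetical 2 x 2 antibody mosaic: (row, column) -> epitope tag captured there.
layout = {(0, 0): "T1", (0, 1): "T2", (1, 0): "T3", (1, 1): "T4"}

# Hypothetical record of which target received which tag when the library was built.
tag_to_target = {
    "T1": "kinase_A",
    "T2": "kinase_B",
    "T3": "sh2_C",
    "T4": "receptor_D",
}


def deconvolute(signals, threshold):
    """Return the targets whose antibody region reads at or above `threshold`."""
    hits = []
    for position, intensity in sorted(signals.items()):
        if intensity >= threshold:
            hits.append(tag_to_target[layout[position]])
    return hits


# Made-up imaging readout for one vessel, normalized to the range 0..1.
signals = {(0, 0): 0.05, (0, 1): 0.92, (1, 0): 0.71, (1, 1): 0.10}
print(deconvolute(signals, threshold=0.5))  # ['kinase_B', 'sh2_C']
```

The same two lookup tables can be reused for every vessel, since each vessel contains an entire antibody region with the same layout.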

The deposition of the antibodies to the membrane can be accomplished by
any means, for example manually spotting the membrane with a capillary pipette, or
in an automated fashion by 'micro-spotting' in small predetermined deposit areas, for
example by the use of printing techniques such as those used in ink jet printers. The
membrane carrying the immobilized antibodies is preferably bonded to a vessel,
such as Merrifield synthesis vessels, flasks, or preferably microtiter 96-well plates,
prior to step ii above, such that each vessel contains an entire antibody region.

The Agent Candidate Pool is portioned into the vessels along with other
assay reagents. Following a detection step, the membrane is imaged to identify
activity of the Uniquely Tagged Target Agents of the Target Library and the effects
of the Agent Candidates.

The SETL approach is advantageously used in developing screening assays 
for receptors or other binding proteins, for example SH2 domains, as shown in 
Examples 2, 3 and 4 below. 

Without further elaboration, it is believed that one skilled in the art can,
using the preceding description, utilize the present invention to its fullest extent.
The following Examples are, therefore, to be construed as merely illustrative and
not a limitation of the scope of the present invention in any way.


Example 1: Novel osteoblast kinases.

The cell type responsible for new bone production is the osteoblast. As a
result of the application of large scale sequencing techniques to osteoblast cDNA
libraries, a number of novel full length kinases have been identified. A number of
full length kinases from osteoblast-like cells are each uniquely tagged with one or
more epitope tags. For example, a unique identifier tag can be attached in tandem
with a common purification tag to facilitate the purification process. The genes are
tagged and simultaneously expressed in an acceptable host, e.g. E. coli or
baculovirus. The tags are used to simultaneously purify all members of the kinase
Target Library as a mixture, which is characterized by Western blotting with
antibodies to the unique tags. Suitable substrates are identified by separately
immobilizing each member of the Target Library via its epitope tag and adding a
mixture of phosphate acceptors along with isotopically labelled ATP.
Phosphorylated substrates are identified by mass spectrometric techniques.

A compound combinatorial library is synthesized by standard methods. The
Target Library is spatially encoded (as described above) along with suitable
substrates. The latter are synthesized so that they contain both a phosphate acceptor
and an epitope tag to co-localize them with the relevant kinase targets. Test
compounds are added to each array, along with isotopically labelled ATP. The
arrays are imaged to identify areas where phosphate has been incorporated.

Active agents from Example 1 are optionally screened in a secondary assay,
such as an osteocalcin release assay which correlates with bone formation.
Additionally, active agents from Example 1 are screened in a battery of cell based
assays covering functions unrelated to bone formation. Preferably, said cell based
assay utilizes reporter gene technology, for example as described in US patent
numbers 5,401,629 and 5,436,128. Such technology utilizes the promoter regions
of various genes of interest, e.g. osteocalcin, coupled to a reporter gene, e.g.
luciferase, and transfected into an appropriate cell line. When this is done in a
multipotent cell line, e.g. embryonic stem cells, the same cells may be used with
different reporter constructs to simultaneously assay for a variety of functions
(hereinafter 'assay library'). Thus the parallel approach of the present invention is
maintained all the way to the secondary assay, giving advantages of efficiency and
wide profiling of active compounds.

Preferably the active agents are then evaluated in animal models or in vitro 
functional assays and screened for toxicity. 

Example 2: Binding assay.

The SETL approach could be used to configure a screening assay for
receptors or other binding proteins, such as SH2 domains. SH2 domains bind
phosphorylated peptide ligands. To configure a SETL assay for SH2 domains, the
domains are expressed as epitope tagged fusion proteins. Antibodies to the tags are
deposited on membranes as above. A mixture of tagged SH2 domains is added to
each well and allowed to bind to the corresponding antibodies. A mixture of
biotinylated phosphopeptide ligands (one for each SH2 target) is added along with
test compounds. Binding of the biotinylated peptide ligands is detected by addition
of streptavidin-biotin complex and colorimetric, fluorescence or enhanced
chemiluminescence (ECL) detection reagents. Imaging of the activity of the
detection reagents on the membrane surface allows binding of the ligand to each
individual SH2 domain to be measured. Such imaging could be automated.
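
Once a signal has been measured at each SH2 domain position, the effect of a test compound can be expressed as percent inhibition of ligand binding relative to a well without compound. The following is a minimal sketch; the function name, the domain labels and all numbers are illustrative assumptions, and the specification does not prescribe any particular calculation.

```python
def percent_inhibition(control, treated, background=0.0):
    """Percent loss of ligand signal relative to a no-compound control well."""
    window = control - background
    if window <= 0:
        raise ValueError("control signal must exceed background")
    return 100.0 * (control - treated) / window


# Made-up spot intensities for two tagged domains, without and with a test compound.
control = {"src": 1000.0, "domain_B": 800.0}
treated = {"src": 250.0, "domain_B": 780.0}

profile = {
    domain: round(percent_inhibition(control[domain], treated[domain]), 1)
    for domain in control
}
print(profile)  # {'src': 75.0, 'domain_B': 2.5}
```

Computing the value per domain from a single well is what gives the selectivity profile: in this made-up readout the compound competes at the src position and has almost no effect at the other.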

A working demonstration of this assay was accomplished by expressing the
human src SH2 domain as a fusion protein.

The fusion protein containing the human src SH2 domain was expressed as
the general sequence: DET1-DET2-spacer-ek-src SH2, where DET1, DET2, spacer
and ek are as described below. DET1 ("defined epitope tag 1") (SEQ ID NO: 1) is
an 11 amino acid sequence found in the Human Immunodeficiency Virus Type 1
(HIV-1) envelope protein gp120 (or gp160). Monoclonal antibodies to various
epitopes of HIV-1 gp120 (or gp160) are known in the art, see, for example U.S.
Patent 5,166,050. One preferred example is monoclonal antibody 178.1 (see, e.g.,
Thiriart et al., J. Immunol., 141:1832-1836 (1989)), which was prepared by
immunization of mice with a yeast-expressed HIV-1 gp160 molecule from strain
BH10 (Ratner et al., Nature, 212:277-284 (1985)). This tag was used for detection
of expression (by Western blot), for purification of the protein (by affinity
chromatography), and for configuring assays in which the fusion protein was
captured or immobilized using the 178.1 antibody. DET2 is a hexa-histidine
sequence tag (SEQ ID NO: 2) which binds to nickel-containing resins and was used
for purification purposes. Spacer (SEQ ID NO: 3) was utilized to design a BamHI
restriction site at the indicated position of the construct. The term -ek- refers to a
recognition sequence (SEQ ID NO: 4) for the enterokinase protease which provides
for the optional removal of the tags from the src SH2 domain, thus producing a src
SH2 domain that contains no extraneous amino acids. A src SH2 domain which
contains no extraneous amino acids is preferable to tagged protein for
crystallography studies.
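
The modular layout of the fusion protein, and the effect of enterokinase cleavage, can be sketched as simple string assembly. Only DET1 (from SEQ ID NO: 1) and the hexa-histidine tag follow the text. The spacer and SH2 strings are arbitrary stand-ins, and the enterokinase site is assumed here to be the canonical Asp-Asp-Asp-Asp-Lys sequence with cleavage after the lysine; SEQ ID NO: 3 and SEQ ID NO: 4 are not reproduced in this section.

```python
DET1 = "KSIRIQRGPGR"  # SEQ ID NO: 1 as given in the sequence listing
DET2 = "HHHHHH"       # hexa-histidine tag
SPACER = "GS"         # placeholder only, not SEQ ID NO: 3
EK = "DDDDK"          # assumed canonical enterokinase site, cut after the Lys
SH2 = "ACDEFG"        # arbitrary stand-in, not the src SH2 domain sequence


def build_fusion(domain):
    """Assemble DET1-DET2-spacer-ek-domain as one amino acid string."""
    return DET1 + DET2 + SPACER + EK + domain


def cleave_with_enterokinase(protein):
    """Return the fragment after the first ek site, or the input if none is found."""
    site = protein.find(EK)
    if site < 0:
        return protein
    return protein[site + len(EK):]


fusion = build_fusion(SH2)
released = cleave_with_enterokinase(fusion)
print(fusion)
print(released)
```

Placing the ek site immediately before the domain is what allows the released fragment to carry no extra residues, as the `released == SH2` check in this sketch illustrates.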

The DNA sequence encoding each DET1-DET2-spacer-ek-src SH2 was
designed such that the indicated restriction sites (BamHI and XbaI) flank the
spacer-ek-src SH2 region, thereby allowing different spacer-ek-SH2 constructs to be
readily substituted into any one of the vectors as described in Procedure 2 below to
create a DET1-DET2-spacer-ek-SH2 tagged protein. The DNA sequence encoding the
DET1-DET2-spacer-ek-src SH2 construct was also designed such that the entire
tagged SH2 domain can be moved as an NdeI-XbaI fragment into any expression
vector containing an NdeI site at an appropriate distance downstream of E. coli
transcription and translation regulatory sequences and a downstream cloning site
compatible with XbaI. Although any suitable vector would yield similar results (e.g.,
pET-11a; Novagen, Inc.), the vector used in the instant experiments was E. coli
expression vector pEA1KnRBS3. This vector is a derivative of the series of vectors
described in Shatzman, A., Gross, M., and Rosenberg, M., 1990, "Expression using
vectors with phage lambda regulatory sequences", In: Current Protocols in Molecular
Biology (F.A. Ausubel et al., eds.), pp. 16.3.1-16.3.11, Greene Publishing and
Wiley-Interscience, N.Y. (hereinafter F.A. Ausubel et al.). The specific vector
pEA1KnRBS3 is described in Bergsma et al., 1991, J. Biol. Chem. 266:23204-23214.

The procedures below describe the expression of the chicken src and the human
src SH2 domains. First, the chicken src SH2 domain was expressed as
DET1-DET2-spacer-SH2. Then, the human domain was inserted into this vector in place
of chicken src to express proteins in the form DET1-DET2-spacer-ek-src SH2, as
described in Procedures 1 and 2 below.



Procedure 1: Cloning and expression of chicken src SH2 domain containing tags
DET1 and DET2 (DET1-DET2-spacer-SH2).

A DNA sequence encoding the tagged protein DET1-DET2-spacer-SH2 was
PCR amplified from a cDNA clone containing the chicken src gene (p5H; Levy et al.,
1986, Proc. Natl. Acad. Sci. USA 83:4228) by methods well known to those skilled
in the art by using the following primers:

5' TT CCATATG AAAAGTATTCCiTAT^
CCACCACCACX jGG ATCC CCGCTG A AG AGTHCIT A CTTT 3' (SEQ ID NO: 7)

The underlined sites are an NdeI recognition site (5') and a BamHI
recognition site (3').

5' GGAA TTCTAGAT TACTAGGACGTGGGGGAGArGTT 3' (SEQ ID NO: 8)

The underlined region is an XbaI recognition site.

The PCR product was digested with NdeI and XbaI, followed by isolation of
the digested fragment on an agarose gel. The fragment was ligated into
NdeI-XbaI-digested pEA1KnRBS3 vector (Bergsma et al., supra) that had been agarose
gel purified as a 6.5 kbp fragment. The ligation reaction was used to transform E. coli
MM294cI+ (F.A. Ausubel et al., supra). A plasmid containing an insertion of the
correct fragment was identified and confirmed by DNA sequencing. The resultant
plasmid encodes DET1-DET2-spacer-SH2 under the control of the phage lambda PL
promoter and regulatory system. Plasmid DNA was purified from MM294cI+ and
used to transform E. coli strain AR120. In this host strain, expression of the phage
promoter can be induced by addition of nalidixic acid to the growing culture as
described in F.A. Ausubel et al., supra. Nalidixic acid induction of AR120
containing this plasmid, followed by analysis of the cellular proteins on an
SDS-polyacrylamide gel stained with Coomassie Blue (F.A. Ausubel et al., supra),
resulted in appearance of a protein band with an apparent molecular weight of
15,000; this band was not seen in uninduced cells or in induced cells containing
pEA1KnRBS3 lacking the PCR amplified fragment. Western blotting confirmed
that the induced protein band reacted with the anti-DET1 monoclonal antibody
178.1.

Procedure 2: Cloning, expression and purification of human src SH2 domain
containing tags and an enterokinase proteolytic cleavage site
(DET1-DET2-spacer-ek-src SH2).

A DNA sequence encoding protein ek-src SH2 was PCR amplified from a
cDNA clone containing the human src gene (c-src SH2 DNA sequence identical to
that described in Takeya, T. and Hanafusa, H., 1983, Cell 32:881-890) using the
following primers:

5' CX jGGATCCT GGACXIACGACGACAAAGCT 3' (SEQ ID NO: 9)

The underlined site is a BamHI recognition site.

5' GGAA TTCTAGA CTATTAGGACGTGGGGCACACGGT 3' (SEQ ID NO: 10)

The underlined region is an XbaI recognition site.
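
A quick way to check that a primer carries the intended cloning site is to search it for the enzyme recognition sequences. The sketch below uses the standard recognition sequences for NdeI, BamHI and XbaI and the SEQ ID NO: 10 primer as it is transcribed in this text; the helper `find_sites` is hypothetical, and the transcription itself may contain extraction errors.

```python
# Standard recognition sequences; all three are palindromic, so one strand suffices.
SITES = {"NdeI": "CATATG", "BamHI": "GGATCC", "XbaI": "TCTAGA"}


def find_sites(sequence):
    """Map each enzyme to the 0-based start positions of its site in `sequence`."""
    seq = sequence.replace(" ", "").upper()
    found = {}
    for enzyme, motif in SITES.items():
        positions = []
        start = seq.find(motif)
        while start != -1:
            positions.append(start)
            start = seq.find(motif, start + 1)
        if positions:
            found[enzyme] = positions
    return found


# SEQ ID NO: 10 as printed above, read 5' to 3' (spaces are ignored).
primer_10 = "GGAA TTCTAGA CTATTAGGACGTGGGGCACACGGT"
print(find_sites(primer_10))  # {'XbaI': [5]}
```

The single XbaI hit near the 5' end matches the statement that this primer contributes the XbaI site, and the absence of NdeI and BamHI sites means the primer would not introduce an unintended cut for those enzymes.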

The PCR product was digested with BamHI and XbaI, followed by isolation
of the digested fragment on an agarose gel. The fragment was ligated into
BamHI-XbaI-digested expression vector containing the tagged chicken src gene
DET1-DET2-spacer-SH2 described in Procedure 1 above. In that vector, the BamHI site
is located between the coding regions for DET2 and SH2, and the XbaI site is
located after the 3' end of the SH2 coding region. The ligation reaction was used to
transform E. coli MM294cI+. The construct DET1-DET2-spacer-ek-src SH2 was
confirmed by DNA sequencing (SEQ ID NO: 5) and induced in E. coli strain
AR120 as described in Procedure 1 above. A Coomassie-Blue-stained,
Western-blot-positive induced protein band with an apparent molecular weight of
16,000 was observed after nalidixic acid induction.

Cells were lysed at neutral pH by sonication in the presence of lysozyme.
After centrifugation, the soluble extract was chromatographed on a Ni-NTA column.
After washing the column with equilibration buffer (Tris buffer pH 8 containing 0.5
M NaCl) and the same buffer containing 15 mM imidazole, the protein,
DET1-DET2-spacer-ek-src SH2, was eluted in highly purified form with 25 mM imidazole
in equilibration buffer.

After manually spotting Mab 178.1 on a PVDF membrane and addition of
tagged src SH2 domain, probing with a biotinylated phosphopeptide ligand for src
SH2 (SEQ ID NO: 6), followed by ECL or colorimetric detection, showed a clear
signal. No signal was seen with an equivalent amount of irrelevant antibody.
Binding of the biotinylated phosphopeptide to the immobilized src SH2 domain was
inhibited by addition of a non-biotinylated phosphopeptide competitor of identical
amino acid sequence to the biotinylated molecule.

Example 3: Protease assay.

As with the kinase assay, it is necessary to identify the optimum substrates
for the Target Library components before configuring the SETL screen. Once these
are known, substrates are synthesized with the appropriate tag on one side of the
cleavage site and a detection label such as biotin on the other. Once the substrates
are immobilized, protease activity can be identified by loss of the label from the
target site.

30 Example 4: Biopanninp method 

Screening of bead-based compound libraries with multiple targets.

Each gene of a Target Library is expressed with two epitope tags: a common
tag C, which is the same for all genes, and a unique tag T, which is different
for each gene. A combinatorial library is synthesized on beads by standard
methods. The combinatorial library is screened with the Target Library by
measuring the association of tag C with the beads and separating individual
positive beads either manually or by automated methods. Following separation,
the proteins are eluted from each positive bead, for example by low pH washing,
and allowed to bind to an array of spatially encoded antibodies against the
unique tags T. The location, and thus the identity, of the spectrum of proteins
from the positive beads is determined by probing the matrix with an antibody
against C, such probe being labeled with a readily detectable marker such as
biotin, with subsequent readout. Further, active compounds are identified by
cleavage from the positive beads with subsequent chemical/physical analysis.
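
The decoding step described above, in which the position of a signal on the antibody array identifies the bound target, can be sketched in Python. This is a minimal illustration under stated assumptions: the array layout, the tag names (T1 to T4), the target names, and the signal threshold are all hypothetical and are not taken from the disclosure.

```python
from typing import Dict, List, Tuple

Position = Tuple[int, int]

# Hypothetical layout: array position (row, column) -> unique tag T
# recognized by the antibody deposited at that position.
ARRAY_LAYOUT: Dict[Position, str] = {
    (0, 0): "T1",
    (0, 1): "T2",
    (1, 0): "T3",
    (1, 1): "T4",
}

# Hypothetical record of which unique tag was fused to which target gene.
TAG_TO_TARGET: Dict[str, str] = {
    "T1": "target A",
    "T2": "target B",
    "T3": "target C",
    "T4": "target D",
}


def decode_bead(signals: Dict[Position, float], threshold: float = 1.0) -> List[str]:
    """Map anti-C probe signals on the array back to target identities.

    A position counts as positive when its signal exceeds the
    (assumed) threshold.
    """
    hits = []
    for position in sorted(signals):
        if signals[position] > threshold and position in ARRAY_LAYOUT:
            hits.append(TAG_TO_TARGET[ARRAY_LAYOUT[position]])
    return hits


if __name__ == "__main__":
    # Hypothetical readout for the eluate of one positive bead.
    eluate_signals = {(0, 0): 0.1, (0, 1): 4.2, (1, 0): 0.2, (1, 1): 3.7}
    print(decode_bead(eluate_signals))
```

Because every target carries the same tag C, a single labeled probe suffices for readout; the identity information lives entirely in the array position.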

While the preferred embodiments of the invention are illustrated by the
above, it is to be understood that the invention is not limited to the precise
instructions herein disclosed and that the right to all modifications coming
within the scope of the following claims is reserved.

SEQUENCE LISTING

(1) GENERAL INFORMATION

(i) APPLICANT: DUNNINGTON, DAMIEN

(ii) TITLE OF THE INVENTION: A PROCESS FOR IDENTIFYING PHARMACEUTICALLY ACTIVE AGENTS

(iii) NUMBER OF SEQUENCES: 10

(iv) CORRESPONDENCE ADDRESS:
    (A) ADDRESSEE: SmithKline Beecham Corporation
    (B) STREET: 709 Swedeland Road
    (C) CITY: King of Prussia
    (D) STATE: PA
    (E) COUNTRY: USA
    (F) ZIP: 19406

(v) COMPUTER READABLE FORM:
    (A) MEDIUM TYPE: Diskette
    (B) COMPUTER: IBM Compatible
    (C) OPERATING SYSTEM: DOS
    (D) SOFTWARE: FastSEQ Version 1.5

(vi) CURRENT APPLICATION DATA:
    (A) APPLICATION NUMBER:
    (B) FILING DATE:
    (C) CLASSIFICATION:

(vii) PRIOR APPLICATION DATA:
    (A) APPLICATION NUMBER: 08/497,357
    (B) FILING DATE: 30-Jun-1995
(viii) ATTORNEY/AGENT INFORMATION:
    (A) NAME: Dustman, Wayne J
    (B) REGISTRATION NUMBER: 33,870
    (C) REFERENCE/DOCKET NUMBER: P50323-3

(ix) TELECOMMUNICATION INFORMATION:
    (A) TELEPHONE: 610-270-5023
    (B) TELEFAX:
    (C) TELEX:



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 11 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE: internal
(vi) ORIGINAL SOURCE:
(ix) FEATURE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

Lys Ser Ile Arg Ile Gln Arg Gly Pro Gly Arg
1               5                   10

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 6 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE: internal
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

His His His His His His
1               5

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 3 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE: internal
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

Gly Ile Leu
1

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 5 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE: internal
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Asp Asp Asp Asp Lys
1               5

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 130 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE: internal
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Met Lys Ser Ile Arg Ile Gln Arg Gly Pro Gly Arg His His His His
1               5                   10                  15

His His Gly Ile Leu Asp Asp Asp Asp Lys Ala Glu Glu Trp Tyr Phe
            20                  25                  30

Gly Lys Ile Thr Arg Arg Glu Ser Glu Arg Leu Leu Leu Asn Ala Glu
        35                  40                  45

Asn Pro Arg Gly Thr Phe Leu Val Arg Glu Ser Glu Thr Thr Lys Gly
    50                  55                  60

Ala Tyr Cys Leu Ser Val Ser Asp Phe Asp Asn Ala Lys Gly Leu Asn
65                  70                  75                  80

Val Lys His Tyr Lys Ile Arg Lys Leu Asp Ser Gly Gly Phe Tyr Ile
                85                  90                  95

Thr Ser Arg Thr Gln Phe Asn Ser Leu Gln Gln Leu Val Ala Tyr Tyr
            100                 105                 110

Ser Lys His Ala Asp Gly Leu Cys His Arg Leu Thr Thr Val Cys Pro
        115                 120                 125

Thr Ser
    130

(2) INFORMATION FOR SEQ ID NO:6:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 19 amino acids
    (B) TYPE: amino acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE: internal
(vi) ORIGINAL SOURCE:
(ix) FEATURE:
    (A) NAME/KEY: Other
    (B) LOCATION: 4...4
    (D) OTHER INFORMATION: phosphorylated tyrosine residue

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

Glu Pro Gln Tyr Glu Glu Ile Pro Ile Tyr Leu
1               5                   10                  15

(2) INFORMATION FOR SEQ ID NO:7:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 87 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7:

TTCCATATGA AAAGTATTCG TATTCAGCGT GGCCCGGGCC GTCACCACCA CCACCACCAC    60
GGGATCCCCG CTGAAGAGTG GTACTTT                                        87

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 38 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

GGAATTCTAG ATTACTAGGA CGTGGGGCAG ACGTT    38



(2) INFORMATION FOR SEQ ID NO:9:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 46 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9:

CGGGATCCTG GACGACGACG ACAAAGCTGA GGAGTGGTAT TTT    46



(2) INFORMATION FOR SEQ ID NO:10:

(i) SEQUENCE CHARACTERISTICS:
    (A) LENGTH: 38 base pairs
    (B) TYPE: nucleic acid
    (C) STRANDEDNESS: single
    (D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA
(iii) HYPOTHETICAL: NO
(iv) ANTISENSE: NO
(v) FRAGMENT TYPE:
(vi) ORIGINAL SOURCE:

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10:

GGAATTCTAG ACTATTAGGA CGTGGGGCAC ACGGT    38

What is claimed is:



1. A process for identifying pharmaceutically active agents which
comprises the steps of:

a) simultaneously expressing a plurality of Uniquely Tagged Target
Agents to form a Target Library;

b) preparing an Agent Candidate Pool; and

c) testing the Target Library and the Agent Candidate Pool in an assay
which identifies agents having desired characteristics.

2. A process for spatially encoding Target Libraries which comprises the
steps of:

a) depositing a plurality of antibodies, each member being directed against a
different epitope tag of a Target Library and each epitope tag of the Target
Library corresponding to at least one antibody grouping, in an array, on a
substrate;

b) blocking the antibody region of the membrane with an inert protein; and

c) exposing a Target Library to the antibody region.

3. A Spatially Encoded Target Library prepared as in claim 2.

4. A process for simultaneously expressing a plurality of uniquely 
tagged genes to form a Target Library which comprises engineering each member of 
the plurality with an epitope tag capable of being recognized by a specified 
antibody, prior to simultaneously expressing the plurality. 

5. A Target Library prepared as in claim 4. 



6. A method of screening bead based combinatorial compound libraries
which comprises the steps of:

a) expressing a Target Library wherein to each member is affixed a
common tag and a unique tag;

b) exposing a bead based combinatorial compound library to the Target
Library and separating the positive hits by detection of the common tag;

c) eluting the active proteins off of the beads; and

d) identification by means of the unique tag.

7. Pharmaceutically active agents identified by the process of claim 1.